Article Info

Article history:
Received Nov 15, 2022
Revised Nov 29, 2022
Accepted Dec 21, 2022

Keywords:
Driver drowsiness detection
Eyes aspect ratio
Multi-task cascaded convolutional neural networks
OpenCV and Dlib

ABSTRACT

Drowsy driving is a major cause of road accidents worldwide, necessitating the development of effective
drowsiness detection systems. Each year, there are more accidents and fatalities than ever before for a variety
of causes. For instance, there were 22,952 fatalities and 79,545 injuries as a result of nearly 66,500 vehicle
accidents in the last 10 years. In this paper, we propose a novel approach for detecting drowsiness based on
behavioral cues captured by a digital camera and utilizing the multi-task cascaded convolutional neural
network (MTCNN) deep learning algorithm. A high-resolution camera records visual indications, such as whether
the eyes are closed or open, so that the technique is based on the driver's behavior. To measure a car user's
weariness in the current frame, eye landmarks are evaluated, which yields a measure known as the "eyes aspect
ratio." Video with a frame rate of 60 frames per second (FPS) and a resolution of 4,320 was used. According to
testing data, the accuracy of sleepiness detection was more than 99.9% in excellent lighting and higher than
99.8% in poor lighting. The current study achieved better sleepiness detection accuracy than many earlier
investigations.

1. INTRODUCTION

Today, a wider range of jobs requires the ability to pay attention. Those employed in the transportation
sector, such as car and truck drivers, steersmen, and airline pilots, must maintain a constant watch on the
road in order to react swiftly to any unforeseen situations (such as vehicle accidents or dogs getting loose
on the road) [1]. Driver fatigue brought on by spending extended periods of time behind the wheel reduces
the driver's ability to respond. According to research results presented at the International Workshop on
sleep disorders, drowsy driving contributes to 30% of traffic accidents [2].

The results of an experiment using a driving simulator were published in the English magazine
"What Car?". They concluded that a fatigued driver poses a considerably greater risk than someone whose blood
alcohol content is 25% higher than the allowable limit. Driver weariness can result in micronaps (such as a loss
of concentration or a catnap lasting between one and thirty seconds) as well as sleeping behind the wheel [3].

According to the National Sleep Foundation's 2012 Sleep in America poll, 20% of employees have driven when
fatigued at least once per month in the preceding year. About 40% of the operators surveyed said that they had
driven while sleepy at least once in the preceding month. More than 25% of those who had driven while fatigued
had fallen asleep at the wheel. Studies have shown that fatigue may impair driving ability on a par with or
even more than drinking; these facts and their consequences describe the scale of the problem [4]. Because
lives have to be saved, drowsy driving demands that research in this area continues.

Previous research on the detection of alcohol-impaired driving shows that the use of sensors and learning
techniques may help prevent these collisions and minimize their consequences. These methods, together with
driving performance indicators based on the particular characteristics of the road environment and steering,
may accurately distinguish between an impaired driver and an unimpaired driver [4]. The suggested system uses
the OpenCV and Dlib libraries in the Python integrated development environment (IDE) to continually capture
images and measure the condition of the eye, mouth, and head in accordance with the defined methodology.


2. LITERATURE REVIEW 

Depending on the source of the data used for drowsiness measurement, there are two different ways to
assess a driver's level of drowsiness. While some systems monitor the state of the vehicle to determine
the driver's level of fatigue, other systems use metrics collected directly from the driver. Lane departures
and steering wheel behavior are the metrics most frequently examined in studies of vehicle state and its
link to weariness.

Gromer et al. [3] present the creation of a low-cost electrocardiogram (ECG) sensor for detecting
tiredness using heart rate variability (HRV) data. The work covers both hardware and software design. The
hardware is a printed circuit board (PCB) meant to be used as a shield for the Arduino. The PCB includes a
low-pass filtered, double inverted ECG channel as well as two analog outputs for the Arduino, a
microcontroller board that connects to and controls the digital-to-analog converter. The digitized ECG signal
is sent for processing to an NVIDIA embedded computer, which handles detection of the QRS complexes, heart
rate (HR), and HRV, as well as visualization. A custom PCB design captures the ECG in this development. For
quick prototyping, the PCB may be plugged into a standard Arduino board. The signal capture was upgraded to
make it more dependable and usable in an automobile setting, and an ECG can still be detected even if the
electrodes are connected incorrectly. A two-channel approach is currently used for signal capture. The signal
processing is performed by modular software that runs an algorithm to identify the QRS complex pattern and
to compute HR and HRV, which are derived from the QRS complexes.
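
To make the HR and HRV step concrete, the sketch below derives the mean heart rate and two common time-domain HRV measures (SDNN and RMSSD) from a list of R-R intervals. It is a minimal illustration and not the implementation described in [3]; the function name `hrv_metrics`, the millisecond units, and the sample intervals are assumptions.

```python
import math


def hrv_metrics(rr_ms):
    """Return mean heart rate (bpm), SDNN (ms), and RMSSD (ms) for R-R intervals in ms."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    mean_rr = sum(rr_ms) / len(rr_ms)
    # Mean heart rate: 60,000 ms per minute divided by the mean beat-to-beat interval.
    heart_rate = 60000.0 / mean_rr
    # SDNN: sample standard deviation of the intervals.
    sdnn = math.sqrt(sum((rr - mean_rr) ** 2 for rr in rr_ms) / (len(rr_ms) - 1))
    # RMSSD: root mean square of successive interval differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return heart_rate, sdnn, rmssd


# Hypothetical R-R intervals (ms), as would be measured between detected QRS complexes.
hr, sdnn, rmssd = hrv_metrics([800, 810, 790, 805, 795])
```

Both HRV measures depend only on the timing of successive QRS complexes, which is why reliable QRS detection is the critical step in such a pipeline.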

Solaz et al. [5] investigate robustness across various types of users and conditions and present a
small in-vehicle camera system that tolerates a lot of movement. Images are analyzed to determine the
chest/abdominal movement of the driver in order to calculate breathing rate. These data are examined in real
time by a proven movement-analysis algorithm that determines the driver's level of weariness and drowsiness.
The experiment proved, using a single on-board camera, that imaging technology may be used to determine a car
driver's breathing rate. A first experiment revealed that thoracic respiratory movement may be monitored
using a depth map created with inexpensive infrared cameras.

Lee et al. [6] explain that the goal of their study is to examine robust and distinguishable HRV signal
patterns obtained from wearable ECG or photoplethysmogram (PPG) sensors for detecting driver drowsiness.
Three varieties of recurrence plots (Bin-RP, Cont-RP, and ReLU-RP) were considered. By extracting and
learning sleepiness-related features, ReLU-RP was able to discriminate between sleepy and awake states when
used as the input to a convolutional neural network (CNN). The R-R intervals of heartbeats have features with
a pattern of vertical (or horizontal) lines. The models were evaluated to determine their efficacy in
detecting realistic sleepiness.

Chellappa et al. [7] suggest a system built for four-wheelers that detects the driver's tiredness or
drowsiness and notifies the driver. The suggested solution employs a 5-megapixel Raspbian camera to record
and evaluate images of the driver's face and eyes in order to detect driver drowsiness. In this method,
fatigue is evaluated by applying a Haar cascade classifier to recognize face and eye cues, and by computing
the Euclidean distances between eye landmarks to calculate the eye aspect ratio (EAR). Reliable face and eye
detection in every frame aids the assessment of the sleepiness level. The frequency of head tilting and eye
blinking is also assessed and contributes to indicating sleepiness.

Naqvi et al. [8] proposed a project that includes a universal serial bus (USB) camera for an
eye-blink monitoring system, as well as a buzzer that alerts the driver when they are drowsy. A global
positioning system (GPS) may be used to track the driver's whereabouts. The suggested web application design
allows the administrator to adjust the system's parameters and send messages to a colleague. The project's
goal is to help solve real-world problems cost-effectively. The buzzer sounds if the driver becomes tired
and shuts his eyes for longer than a second.
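
The buzzer rule (eyes shut for longer than a second) can be sketched as a counter over consecutive closed-eye frames. This is an illustrative sketch, not the code of [8]; the function name `first_alarm_frame`, the per-frame boolean input, and the 10 FPS example stream are assumptions.

```python
def first_alarm_frame(closed_flags, fps, max_closed_seconds=1.0):
    """Return the index of the first frame at which the eyes have been
    continuously closed for longer than max_closed_seconds, or None."""
    limit_frames = fps * max_closed_seconds
    run = 0  # number of consecutive closed-eye frames so far
    for i, closed in enumerate(closed_flags):
        run = run + 1 if closed else 0
        if run > limit_frames:
            return i
    return None


# Hypothetical 10 FPS stream: a short blink (2 frames), then a long closure (12 frames).
flags = [False] * 5 + [True] * 2 + [False] * 5 + [True] * 12
alarm_at = first_alarm_frame(flags, fps=10)
```

Resetting the counter whenever the eyes reopen means that ordinary blinks never accumulate into an alarm.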

Panicker and Nair [9] proposed a method based on computer vision that provides a unique
methodology for the detection of open eyes, which may be employed in driver sleepiness studies. A
low-resolution camera is used to capture the driver footage in the suggested technique. The proposed
drowsiness detection system has three steps. In the first stage, face detection is carried out using
elliptical approximation and template matching algorithms. In the second step, the open eye is recognized
using the suggested pattern analysis of the iris and sclera. In the third stage, the percentage of eye
closure (PERCLOS) metric is used to determine the driver's drowsiness level. The approach has been shown to
operate well on low-resolution images for a variety of eye positions, and it produces good results on
images with variable illumination and complicated backgrounds.
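
As a rough illustration of the third stage, PERCLOS can be approximated as the fraction of frames in a time window in which the eye is classified as closed. The sketch below makes simplifying assumptions: a binary open/closed decision per frame, a hypothetical cut-off of 0.15, and hypothetical names `perclos` and `is_drowsy`; it is not the procedure of [9].

```python
def perclos(closed_flags):
    """Fraction of frames in the window during which the eye is classified as closed."""
    if not closed_flags:
        raise ValueError("empty window")
    return sum(1 for c in closed_flags if c) / len(closed_flags)


def is_drowsy(closed_flags, perclos_limit=0.15):
    # perclos_limit is a hypothetical cut-off chosen for illustration only.
    return perclos(closed_flags) > perclos_limit


# Hypothetical window of 60 frames, 15 of which show a closed eye.
window = [False] * 45 + [True] * 15
```

Because the measure is averaged over a window, a single long blink raises it only slightly, whereas sustained eye closure pushes it over the cut-off.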


3. MATERIALS AND METHOD 

The purpose of this research is to create a system that estimates a driver's level of drowsiness
from a series of images captured so that the person's face is visible. The drowsiness detection system
developed in this study is part of a driver-based advanced driver assistance system (ADAS) [10] and is
subject to two key constraints: early detection and a reduction in the number of false positives.
Determining the frame rate that the camera must provide to the system in order to record the driver is
crucial. Because of the large number of frames per second (FPS) that must be examined, a high frame rate
will overburden the machine [11], [12]. However, a low FPS might have a significant impact on the system's
performance. To capture events of extremely brief duration in the image sequence, such as blinks, there
must be a sufficient number of FPS.
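
The trade-off can be illustrated with simple arithmetic: the number of frames that fall inside a brief event grows linearly with the frame rate. The sketch below assumes a blink duration of 200 ms purely for illustration; that value and the function names are assumptions and do not come from this study.

```python
def frames_per_event(fps, event_ms):
    """Whole frames captured during an event lasting event_ms milliseconds."""
    return fps * event_ms // 1000


def min_fps(event_ms, frames_needed):
    """Smallest integer frame rate giving at least frames_needed frames per event."""
    return -(-frames_needed * 1000 // event_ms)   # ceiling division on integers


BLINK_MS = 200  # hypothetical blink duration, for illustration only
frames_at_60 = frames_per_event(60, BLINK_MS)
frames_at_10 = frames_per_event(10, BLINK_MS)
```

Under this assumption, a low frame rate leaves only a couple of frames per blink, which makes the closure easy to miss, while a high frame rate multiplies the per-second processing load.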


3.1. Dataset descriptions 

Facial data from the Kaggle website, which serves the public and students by providing ready-to-use
data, were used [13]. The initial training showed that the prediction performance was not satisfactory. We
therefore added 1000 images, merged them with the previously used data, and repeated the training phase.
Training is time-consuming: the first training run took more than two days and the second more than three
days. The results were still not satisfactory, so the data were enhanced with a large group of images,
bringing the total to 6000 images covering several age groups (the elderly, the middle-aged, and the young)
and both genders; the reported results were obtained from this dataset [14]. The drowsiness detection
algorithm, which identifies and detects driver drowsiness, is based on the MATLAB programming language. This
paper describes the main tools used for this algorithm.


3.2. Dlib open source library 

Functions for face and landmark detection are available in the Dlib library. Dlib's face detection
uses the histogram of oriented gradients (HOG) approach, while its landmark detection is based on Kazemi's
model [15], [16]. It provides 68 distinct feature points on a face. The positions of the 68 points found on
a face are shown in Figure 1.

Figure 1. The positions of the 68 points identified on a face


3.3. Eye aspect ratio

The EAR is a scalar quantity that responds notably to the opening and closing of the eyes [17].
Pandey and Muppalaneni [18] created a drowsiness identification and accident prevention system based on
blink duration, and their system demonstrated high accuracy on the yawning detection dataset (YawDD). They
employed an EAR threshold of 0.4 to differentiate between the open and closed states of the eye. The typical
trend of the EAR value over time during one blink is shown in Figure 2 [19]. The EAR value changes quickly
during blinking, either decreasing or increasing [20]. In accordance with the findings of earlier research,
we employed threshold values to pinpoint the abrupt rise or fall in EAR values brought on by blinking.
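
A minimal sketch of threshold-based blink detection follows: a blink is counted for each run of consecutive frames whose EAR lies below the threshold. The function name, the `min_closed_frames` parameter, and the sample trace are assumptions for illustration; the threshold is passed in explicitly rather than fixed.

```python
def count_blinks(ear_values, threshold, min_closed_frames=1):
    """Count blinks as runs of consecutive frames whose EAR is below the threshold."""
    blinks = 0
    run = 0  # length of the current below-threshold run
    for ear in ear_values:
        if ear < threshold:
            run += 1
        else:
            if run >= min_closed_frames:
                blinks += 1
            run = 0
    # A run that is still open at the end of the sequence also counts.
    if run >= min_closed_frames:
        blinks += 1
    return blinks


# Hypothetical EAR trace: eyes open, one dip during a blink, then open again.
trace = [0.36, 0.35, 0.34, 0.22, 0.12, 0.10, 0.21, 0.33, 0.35]
n_blinks = count_blinks(trace, threshold=0.3)
```

Raising `min_closed_frames` filters out single-frame dips caused by landmark jitter, at the cost of missing very short blinks at low frame rates.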

According to earlier studies, a threshold of 0.3 is advantageous for the current project. Numerous
other methods for blink detection employing image processing techniques have also been proposed in the
literature. They do have certain limitations, though, such as stringent requirements for image and text
quality, which are hard to get around. In our experiment, we chose EAR thresholds of 0.2 and 0.3 based on
the findings of prior studies [21]. Because the EAR formula is insensitive to the orientation and distance
of the face, it has the advantage of working on faces seen from a distance. The EAR value is calculated by
entering the six coordinates around the eyes in Figures 3(a) and (b) into (1) and (2).

Figure 2. The initial blink, recognized between frames 60 and 65 using a single-blink detection technique


Figure 3. Eye examples with facial landmarks (P1-P6): (a) open eye and (b) closed eye
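
For readers who want to experiment with the six landmarks P1-P6, the sketch below uses the commonly cited definition EAR = (|P2 - P6| + |P3 - P5|) / (2 |P1 - P4|), where P1 and P4 are the eye corners. Whether this matches (1) and (2) term by term is an assumption, as is averaging the two eyes in `mean_ear`; the coordinates are made up.

```python
import math


def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """EAR from six (x, y) eye landmarks, with p1/p4 the eye corners and
    p2, p3 (upper lid) paired with p6, p5 (lower lid)."""
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def mean_ear(left_eye, right_eye):
    """Average the EAR of both eyes; each argument is a sequence of six points."""
    return (eye_aspect_ratio(*left_eye) + eye_aspect_ratio(*right_eye)) / 2.0


# Hypothetical landmark coordinates in pixels.
open_eye = [(0, 0), (10, -6), (20, -6), (30, 0), (20, 6), (10, 6)]
closed_eye = [(0, 0), (10, -1), (20, -1), (30, 0), (20, 1), (10, 1)]
ear_open = eye_aspect_ratio(*open_eye)
ear_closed = eye_aspect_ratio(*closed_eye)
```

Because the ratio divides vertical by horizontal distances measured on the same eye, scaling all coordinates by a common factor leaves the value unchanged.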


3.4. Facial landmark recognition algorithm 

Facial landmark recognition algorithms are computer vision techniques designed to identify and locate 
specific facial landmarks or keypoints on a human face. These algorithms play a crucial role in various 
applications such as face analysis, facial expression recognition, face tracking, augmented reality, and 
drowsiness detection systems [22]. The goals of facial landmark recognition are to find the face in the
image and to identify the points that make up the face structure. To complete these two tasks, the facial
landmark recognition algorithm identifies 68 major points, given as (x, y) coordinates, that make up the
human face [23], thereby locating the mouth, left eyebrow, right eyebrow, left eye, right eye, nose, and jaw.
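
The grouping of the 68 points into facial regions can be sketched with plain index ranges. The ranges below follow the widely used 68-point annotation layout (0-based, end-exclusive); treating them as the layout used in this work is an assumption, and the dictionary and function names are hypothetical.

```python
# Index ranges (0-based, end-exclusive) in the widely used 68-point annotation scheme.
LANDMARK_GROUPS = {
    "jaw": (0, 17),
    "right_eyebrow": (17, 22),
    "left_eyebrow": (22, 27),
    "nose": (27, 36),
    "right_eye": (36, 42),
    "left_eye": (42, 48),
    "mouth": (48, 68),
}


def split_landmarks(points):
    """Split a list of 68 (x, y) points into named facial regions."""
    if len(points) != 68:
        raise ValueError("expected 68 landmark points")
    return {name: points[start:end] for name, (start, end) in LANDMARK_GROUPS.items()}


# Hypothetical placeholder points; a real detector would supply image coordinates.
dummy = [(i, i) for i in range(68)]
regions = split_landmarks(dummy)
```

Each eye receives six points, which is exactly the input needed for the eye aspect ratio.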


3.5. Feature extraction and image classification using deep learning multi-task cascaded convolutional
neural networks

The following procedure provides a clearer description of the three steps of multi-task cascaded
convolutional neural networks (MTCNN) [24]:

a. The MTCNN initially produces numerous frames that scan the complete picture, starting from the top-left
corner and moving towards the bottom-right corner. The proposal network (P-Net), a shallow and fully
connected CNN, is used for information retrieval.

b. In the second stage, all the data from the P-Net are fed to the refinement network (R-Net), a fully
connected and more complex CNN that rejects the majority of frames that do not include faces.

c. The third step uses a more advanced CNN called the output network (O-Net), which, as its name implies,
outputs the facial landmark locations after identifying a face in the provided picture or video [25], as
seen in Figure 4.

Figure 4. The three stages of MTCNN [16]
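
Before the first stage scans the picture, MTCNN implementations typically resize it to a pyramid of scales so that a fixed-size scanning window can match faces of different sizes. The sketch below computes such a list of scales. The 12-pixel window, the minimum face size of 20 pixels, and the scale factor of 0.709 are assumptions taken from common open-source implementations, not values reported in this paper.

```python
def pyramid_scales(height, width, min_face_size=20, scale_factor=0.709, cell_size=12):
    """Scales at which the image is resized before the first-stage network scans it."""
    scales = []
    m = cell_size / min_face_size          # maps the smallest face of interest to one cell
    min_side = min(height, width) * m
    factor_count = 0
    # Keep shrinking until the shorter side no longer fits one scanning window.
    while min_side >= cell_size:
        scales.append(m * scale_factor ** factor_count)
        min_side *= scale_factor
        factor_count += 1
    return scales


# Hypothetical 480 x 640 frame.
scales = pyramid_scales(480, 640)
```

Each additional scale adds work for the first stage, which is one reason the later stages only examine the candidates that the earlier ones let through.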


This method improves processing efficiency by distributing the features, ten at a time, among individual
stages and sub-windows. OpenCV and MATLAB were used to develop the cascade classifier. For identifying faces
in images, the Haar cascade classifier makes use of Haar features. The cascade approach evaluates each
sub-window in accordance with its features [26]. In Figure 4, the classifier starts the assessment and
examines each sub-window. A sub-window proceeds through the stages if it receives a positive classification
(face); otherwise, a negative sub-window (not a face) is rejected right away. By discarding non-face
windows at the start, this technique boosts face detection performance.
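
The early-rejection idea can be sketched independently of any particular feature set: each stage either passes a sub-window on or rejects it, and rejected windows never reach the later stages. The stage functions and feature names below are hypothetical stand-ins, not actual Haar features.

```python
def run_cascade(window, stages):
    """Return (is_face, stages_evaluated). Each stage is a function that
    returns True to pass the window on and False to reject it."""
    for evaluated, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, evaluated   # rejected: later stages are never run
    return True, len(stages)


# Hypothetical stages operating on a dict of precomputed feature scores.
stages = [
    lambda w: w["contrast"] > 0.2,
    lambda w: w["eye_band"] > 0.5,
    lambda w: w["nose_bridge"] > 0.5,
]
face_like = {"contrast": 0.8, "eye_band": 0.9, "nose_bridge": 0.7}
background = {"contrast": 0.1, "eye_band": 0.9, "nose_bridge": 0.7}
face_result = run_cascade(face_like, stages)
background_result = run_cascade(background, stages)
```

Since most sub-windows in a frame are background, rejecting them after the first cheap stage saves most of the computation.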


4. EXPERIMENTAL RESULTS

The experimental results show how well the suggested behavior-based drowsiness detection system (DDS),
which is based on a digital camera and the MTCNN deep learning algorithm, performs. Thanks to its high
accuracy and precision, the system can consistently detect sleepy states, which reduces the likelihood of
false alarms. According to the good recall rate, the system correctly identifies most cases of sleepiness,
which lowers the possibility of false negatives. The global face features, such as the positions of the left
and right eyes, nose, and corners of the mouth, can be determined by using the deep cascaded multi-task
MTCNN framework, which allows for simultaneous face detection and alignment. The internal relationship
between the two tasks is also exploited to improve performance, and drowsiness is determined using two
conditions.


4.1. Normal detection case open eye close mouth

In this case, the eye is open and the mouth is closed. This state is called awake or normal. Samples
were taken as shown in Table 1. These samples were taken in good lighting. The first scenario represents the
test sample results, for which the threshold value is 47. Figure 5 shows the ratio of the mouth to the eye
(MOE); when the mouth is closed and the eye is open, the ratio's value is greater than the threshold value
in the table.


Table 1. Normal detection case open eye close mouth
Samples    No. of monitoring    No. of detections    Threshold    MOE
10         10                   9                    47           46
10         10                   10                   47           49
10         10                   10                   47           50
10         10                   10                   47           52
10         10                   10                   47           53
10         10                   10                   47           54
10         10                   10                   47           55
10         10                   10                   47           56
10         10                   10                   47           56
10         10                   10                   47           57
Accuracy   100%                 99.9%

Figure 5. The ratio of MOE for the first case


4.2. Drowsiness detection case close eye close mouth


In this case, the eye is closed and the mouth is closed. This state is called drowsiness. Samples were
taken as shown in Table 2. These samples were also taken in good lighting.


Table 2. Drowsiness detection case close eye close mouth
Samples    No. of monitoring    No. of detections    Threshold    MOE
10         10                   10                   47           39
10         10                   10                   47           40
10         10                   10                   47           41
10         10                   10                   47           44
10         10                   10                   47           44
10         10                   10                   47           45
10         10                   10                   47           45
10         10                   9                    47           49
10         10                   10                   47           43
10         10                   10                   47           44
Accuracy   100%                 99.9%


In the second case, the results from the tested samples are shown with a threshold of 47, and the
ratio of the mouth to the eye appears in the values above. When both the eye and the mouth are closed, the
ratio's value is less than the threshold, as shown in Figure 6. When the performance of the DDS is compared
with previous works in terms of accuracy, a very high accuracy is observed relative to the two other
references, as shown in Table 3.

Figure 6. The ratio of MOE for the second case


Table 3. Comparison with previous works
Ref.               Strategy                                                                 Percentage (%)
[8]                Smart glasses-based drowsiness-fatigue detection and cloud platform      98.4
[9]                Novel algorithm for detecting faces (KCF and deep learning networks)     93
Proposed method    Digital camera based on deep learning                                    99.9
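
The threshold rule described for the two good-lighting cases can be sketched as a one-line classifier applied to the MOE values listed in Tables 1 and 2 (one value per table row). Assigning a value exactly equal to the threshold to the drowsy class is an assumption, as are the function and variable names.

```python
def classify_moe(moe, threshold):
    """Label a sample 'awake' when its MOE exceeds the threshold, otherwise 'drowsy'."""
    return "awake" if moe > threshold else "drowsy"


THRESHOLD = 47  # threshold reported for the good-lighting cases
table1_moe = [46, 49, 50, 52, 53, 54, 55, 56, 56, 57]   # open eye, closed mouth
table2_moe = [39, 40, 41, 44, 44, 45, 45, 49, 43, 44]   # closed eye, closed mouth

awake_hits = sum(classify_moe(m, THRESHOLD) == "awake" for m in table1_moe)
drowsy_hits = sum(classify_moe(m, THRESHOLD) == "drowsy" for m in table2_moe)
```

With the listed values, one row in each table falls on the other side of the threshold (MOE 46 in Table 1 and MOE 49 in Table 2); these are the same rows that report 9 detections instead of 10.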


4.3. Normal detection case open eye close mouth in bad light

Additionally, we examined how the detection of the eyes and lips is affected when the driver is awake,
this time in dim illumination. According to Table 4, the findings from the tested samples reveal that the
threshold value is 44, and the ratio of the mouth to the eye is shown in Figure 7. When the mouth is closed
and the eye is open, the ratio's value is greater than or equal to the threshold value in most samples, as
shown in the table.


Table 4. Bad light case open eye close mouth
Samples    No. of monitoring    No. of detections    Threshold    MOE
10         10                   9                    44           40
10         10                   10                   44           48
10         10                   10                   44           46
10         10                   10                   44           44
10         10                   10                   44           45
10         10                   10                   44           47
10         10                   10                   44           41
10         10                   9                    44           52
10         10                   10                   44           53
10         10                   10                   44           51
Accuracy   100%                 99.8%

Figure 7. The ratio of MOE for the third case


4.4. Drowsiness detection case close eye close mouth in bad light

Both the mouth and the eye are closed in this case. This state is called drowsiness. Samples were
collected as indicated in Table 5. Note that these samples were obtained in poor lighting. The results from
the tested samples are displayed for a threshold of 44. The ratio of the mouth to the eye is also shown in
the values; when both the mouth and the eye are closed, the ratio's value is lower than the threshold, as
shown in Figure 8.

We now discuss the collected results from both cases and compare them with earlier efforts. The first
scenario demonstrates the result for a driver who is not wearing glasses and calculates the impact of the
EAR and mouth aspect ratio (MAR) thresholds. It was observed that the suggested model brought improvements.
The model has drowsiness detection capabilities and can alert the driver with a notification. Compared with
other research, the model used in the suggested approach is also one of the most effective ways of detecting
drowsiness. According to the preceding section, the drowsiness detection rate in this situation (without
glasses) may reach 99% for both the EAR and the MAR.

Table 5. Drowsiness detection case close eye close mouth in bad light
Samples    No. of monitoring    No. of detections    Threshold    MOE
10         10                   10                   44           31
10         10                   10                   44           32
10         10                   10                   44           33
10         10                   9                    44           45
10         10                   10                   44           36
10         10                   10                   44           36
10         10                   10                   44           36
10         10                   9                    44           47
10         10                   10                   44           37
10         10                   10                   44           37
Accuracy   100%                 99.8%

Figure 8. Case of drowsiness detection, close eye close mouth


5. CONCLUSION 

One of the major contributions of this research is the generation of a new benchmark for driver
drowsiness detection systems, motivated by the risk that drowsiness poses. It is important for drivers to
adopt safer habits, such as staying away from driving at night, avoiding medication and alcohol, getting a
good amount of sleep, and drinking more caffeine. Research on drowsy driving detection algorithms is one of
the most significant strategies for reducing traffic accidents. In this study, we present a novel driving
drowsiness detection method that takes different variations into account. To acquire the face of the driver
in real-time video, we first created an MTCNN model, which eliminates the artificial feature extraction
step of standard face detection techniques. According to experimental data, face recognition accuracy can
approach 99.99%. EAR and MAR are used as measurements. EAR, a parameter based on the Dlib toolkit, was
proposed to analyze the condition of the driver's eyes; experiments demonstrate that there is a significant
link between the EAR and the state of the driver's eyes. MAR is likewise a parameter based on the Dlib
toolkit and is used to analyze the condition of the driver's mouth. Experiments demonstrate that there is a
substantial link between the MAR and the effect of the driver's mouth on the eye, because the facial muscles
move when the mouth opens and closes, which supports the logic of this theory.